A simulated annealing approach to speaker segmentation in audio databases

Authors

  • José M. Leiva-Murillo
  • Sancho Salcedo-Sanz
  • Ascensión Gallardo-Antolín
  • Antonio Artés-Rodríguez
Abstract

In this paper we present a novel approach to the problem of speaker segmentation, a necessary preliminary step for audio indexing. Mutual information is used to evaluate the accuracy of a segmentation and serves as the function to be maximized by a simulated annealing (SA) algorithm. We introduce a novel mutation operator for the SA, the Consecutive Bits Mutation operator, which improves the performance of the SA on this problem. We also use the so-called Compaction Factor, which allows the SA to operate in a reduced search space. Our algorithm has been tested on the segmentation of real audio databases and compared with several existing algorithms for speaker segmentation, obtaining very good results in the test problems considered. © 2007 Elsevier Ltd. All rights reserved.
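
To make the general idea concrete, the following is a minimal sketch of a generic simulated annealing loop over a binary vector in which a 1 marks a candidate speaker change point. Everything in it is an assumption made for illustration: the abstract does not define the encoding, the Consecutive Bits Mutation operator, or the Compaction Factor, so the `mutate` move (flip one bit, or swap two adjacent bits that differ) is only a guess at a plausible neighbourhood, and `toy_score` is a stand-in for the mutual information objective, which is not reproduced here.

```python
import math
import random


def toy_score(bits, target):
    """Stand-in objective (assumption): count positions that agree with a
    known target. The paper maximizes mutual information instead."""
    return sum(1 for b, t in zip(bits, target) if b == t)


def mutate(bits, rng):
    """Hypothetical neighbour move, NOT the paper's operator: either swap
    two adjacent bits that differ (shifting a boundary by one position)
    or flip a single randomly chosen bit."""
    new = list(bits)
    pairs = [i for i in range(len(new) - 1) if new[i] != new[i + 1]]
    if pairs and rng.random() < 0.5:
        i = rng.choice(pairs)
        new[i], new[i + 1] = new[i + 1], new[i]
    else:
        i = rng.randrange(len(new))
        new[i] = 1 - new[i]
    return new


def anneal(score, n_bits, steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Generic simulated annealing that maximizes `score` over bit lists."""
    rng = random.Random(seed)
    current = [0] * n_bits
    cur_s = score(current)
    best, best_s = list(current), cur_s
    t = t0
    for _ in range(steps):
        cand = mutate(current, rng)
        s = score(cand)
        # Always accept improvements; accept worse candidates with a
        # probability that shrinks as the temperature t decreases.
        if s >= cur_s or rng.random() < math.exp((s - cur_s) / t):
            current, cur_s = cand, s
            if cur_s > best_s:
                best, best_s = list(current), cur_s
        t *= cooling  # geometric cooling schedule (assumption)
    return best, best_s


if __name__ == "__main__":
    target = [0, 0, 1, 0, 0, 0, 1, 0, 0, 0]  # made-up change points
    best, best_s = anneal(lambda b: toy_score(b, target), len(target))
    print(best, best_s)
```

The geometric cooling schedule, the step count, and the all-zeros starting point are arbitrary choices for the sketch; a real system would replace `toy_score` with an objective computed from audio features.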

Similar resources

Unsupervised Speaker Segmentation using Autoassociative Neural Network

In this paper we propose an unsupervised approach to speaker segmentation using an autoassociative neural network (AANN). Speaker segmentation aims at finding speaker change points in a speech signal, which is an important preprocessing step for audio indexing, spoken document retrieval, and multi-speaker diarization. The method extracts the speaker-specific information from the Mel frequency cepstra...

Automatic Building of Synthetic Voices from Audio Books

Current state-of-the-art text-to-speech systems produce intelligible speech but lack the prosody of natural utterances. Building better models of prosody involves development of prosodically rich speech databases. However, development of such speech databases requires a large amount of effort and time. An alternative is to exploit story style monologues (long speech files) in audio books. These...

Using design of experiments approach and simulated annealing algorithm for modeling and Optimization of EDM process parameters

The main objectives of this research are, therefore, to assess the effects of process parameters and to determine their optimal levels in the machining of Inconel 718 superalloy. Gap voltage, current, machining time, and duty factor are the tuning parameters studied as process inputs. Furthermore, two important process output characteristics have been evaluated in this research...

Sequencing Mixed Model Assembly Line Problem to Minimize Line Stoppages Cost by a Modified Simulated Annealing Algorithm Based on Cloud Theory

This research presents a new application of the cloud theory-based simulated annealing algorithm to solve mixed model assembly line sequencing problems where line stoppage cost is expected to be optimized. This objective is highly significant in mixed model assembly line sequencing problems based on just-in-time production system. Moreover, this type of problem is NP-hard and solving this probl...

Journal:
  • Eng. Appl. of AI

Volume 21  Issue 

Pages  -

Publication date: 2008